Linguistic Features for Quality Estimation

نویسندگان

  • Mariano Felice
  • Lucia Specia
چکیده

This paper describes a study on the contribution of linguistically-informed features to the task of quality estimation for machine translation at sentence level. A standard regression algorithm is used to build models using a combination of linguistic and non-linguistic features extracted from the input text and its machine translation. Experiments with EnglishSpanish translations show that linguistic features, although informative on their own, are not yet able to outperform shallower features based on statistics from the input text, its translation and additional corpora. However, further analysis suggests that linguistic information is actually useful but needs to be carefully combined with other features in order to produce better results.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Linguistic Indicators for Quality Estimation of Machine Translations

This work presents a study of linguistically-informed features for the automatic quality estimation of machine translations. In particular, we address the problem of estimating quality when no reference translations are available, as this is the most common case in real world situations. Unlike previous attempts that make use of internal information from translation systems or rely on purely sh...

متن کامل

Quality estimation for Machine Translation output using linguistic analysis and decoding features

We describe a submission to the WMT12 Quality Estimation task, including an extensive Machine Learning experimentation. Data were augmented with features from linguistic analysis and statistical features from the SMT search graph. Several Feature Selection algorithms were employed. The Quality Estimation problem was addressed both as a regression task and as a discretised classification task, b...

متن کامل

Error Detection for Statistical Machine Translation Using Linguistic Features

Automatic error detection is desired in the post-processing to improve machine translation quality. The previous work is largely based on confidence estimation using system-based features, such as word posterior probabilities calculated from N best lists or word lattices. We propose to incorporate two groups of linguistic features, which convey information from outside machine translation syste...

متن کامل

A CCG-based Quality Estimation Metric for Statistical Machine Translation

We describe a metric for estimating the quality of Statistical Machine Translation (SMT) output based on syntactic features extracted using Combinatory Categorial Grammar (CCG). CCG has been demonstrated to be better suited to deal with SMT texts than context free phrase structure grammar formalisms. We use CCG features to estimate the grammaticality of the translations by dividing them into ma...

متن کامل

The Effect of Genre Awareness on English Translation Quality and Pedagogy: A Case of News Reports Translation as an Academic Curriculum

To produce an adequate translation, language students are required to learn varieties of language features including syntax, semantics and pragmatics. Considering the curriculum language learners are face with, one can claim that almost all language students in Iran are taught these features in their academic settings including linguistic courses. Yet, there are some aspects of language which a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012